\(^1\) GenomEast platform, IGBMC
For this training, we will use two datasets:
The data are publicly available in GEO under the accession number GSE59572. This series contains two subseries:
Bioinformatics tools will be run through Galaxy France, the French instance of Galaxy, to analyze the data.
The genome browser IGV will be used to visualize the data in a genomic context.
BioJupies will be used to run the differential expression analysis.
Galaxy is a platform that allows users to run bioinformatics tools on a high-performance computing cluster through a simple web interface. We are going to use the French instance of Galaxy, Galaxy France.
Go to the Galaxy France website, https://usegalaxy.fr/, and log in with your personal account.
The data analyzed during this training are available in a public history: https://usegalaxy.fr/u/stephanie/h/neuro-epigenetics-training-data. Import this history.
The datasets are in the imported history “Imported: Neuro-epigenetics training (data)”.
Click on the downward arrow at the top right of your history panel and select “Show History Side-by-Side”.
Drag and drop the datasets R6_1_387_St.chr19.fastq.gz and WT_320_St.chr19.fastq.gz from the imported history to the working one.
Analysis of RNA-seq data will be run with the following steps:
Tool: FastQC
Website: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Citation: Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online]. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Use of the tool: FastQC assesses the quality of high-throughput sequencing data. It takes raw sequencing data (FASTQ files) or mapping results (BAM or SAM files) and generates an HTML report whose summary graphs give a quick impression of data quality.
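To make one of these summary graphs concrete, the following minimal sketch computes the mean Phred quality at each read position, which is the quantity behind a per-base quality plot. It is an illustration only, not FastQC's implementation: the function name `mean_quality_per_position` and the toy reads are hypothetical, and the Phred+33 quality encoding is an assumption.

```python
import io


def mean_quality_per_position(fastq_handle, phred_offset=33):
    """Return the mean Phred quality at each read position.

    Assumes 4-line FASTQ records and Phred+33 encoding (an assumption).
    """
    sums = []    # running sum of quality scores per position
    counts = []  # number of reads covering each position
    for line_number, line in enumerate(fastq_handle):
        # In a 4-line FASTQ record, the quality string is the 4th line.
        if line_number % 4 != 3:
            continue
        quality = line.strip()
        for position, char in enumerate(quality):
            if position == len(sums):
                sums.append(0)
                counts.append(0)
            sums[position] += ord(char) - phred_offset
            counts[position] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy records (hypothetical reads, not taken from the training data).
toy_fastq = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACG\n+\nI5!\n"
)
means = mean_quality_per_position(toy_fastq)
print(means)  # [40.0, 30.0, 20.0, 40.0]
```

For a real compressed file such as WT_320_St.chr19.fastq.gz, the handle could be opened in text mode with `gzip.open(path, "rt")` from the Python standard library.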
Tool: STAR
Documentation: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
Citation: Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan 1;29(1):15-21. doi: 10.1093/bioinformatics/bts635. Epub 2012 Oct 25. PMID: 23104886; PMCID: PMC3530905.
Use of the tool: STAR maps RNA-seq reads to the reference genome very quickly. It uses known transcript junction information to align reads but can also discover new splice junctions.
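In this training STAR is run through Galaxy, but it can help to see what an equivalent command-line call might look like. The sketch below only assembles the command and prints it. The index directory `star_index_mm9`, the output prefix, the thread count, and the helper name `build_star_command` are hypothetical, and the parameters actually set by the Galaxy tool may differ.

```python
import shlex


def build_star_command(genome_dir, fastq_gz, gtf, out_prefix, threads=4):
    """Assemble a STAR alignment command as a list of arguments."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,        # pre-built STAR genome index
        "--readFilesIn", fastq_gz,
        "--readFilesCommand", "zcat",     # reads are gzip-compressed
        "--sjdbGTFfile", gtf,             # known transcript junctions
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,
    ]


command = build_star_command(
    genome_dir="star_index_mm9",          # hypothetical index directory
    fastq_gz="WT_320_St.chr19.fastq.gz",
    gtf="Mus_musculus.NCBIM37.67_UCSConlychr.gtf",
    out_prefix="WT_320_St.chr19.",        # hypothetical output prefix
)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it could be passed to `subprocess.run(command, check=True)` on a machine where STAR and a matching index are installed.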
The dataset is in the imported history “Imported: Neuro-epigenetics training (data)”.
Click on the downward arrow at the top right of your history panel and select “Show History Side-by-Side”.
Drag and drop the dataset Mus_musculus.NCBIM37.67_UCSConlychr.gtf from the imported history to the working one.
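The GTF file describes the annotated genes and transcripts as tab-separated lines with nine columns. As a rough illustration of that layout, the sketch below parses a single line. The example line is invented for this illustration (it is not taken from Mus_musculus.NCBIM37.67_UCSConlychr.gtf), the function name `parse_gtf_line` is hypothetical, and the simple attribute parsing assumes that values contain no semicolons.

```python
def parse_gtf_line(line):
    """Split one GTF line into its nine tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    attributes = {}
    # Column 9 holds 'key "value";' pairs separated by semicolons.
    for item in fields[8].split(";"):
        item = item.strip()
        if item:
            key, _, value = item.partition(" ")
            attributes[key] = value.strip().strip('"')
    return {
        "seqname": fields[0],
        "feature": fields[2],
        "start": int(fields[3]),  # 1-based, inclusive
        "end": int(fields[4]),    # 1-based, inclusive
        "strand": fields[6],
        "attributes": attributes,
    }


# Invented example line with hypothetical identifiers.
toy_line = (
    "chr19\tprotein_coding\texon\t3000\t3200\t.\t+\t.\t"
    'gene_id "GENE_A"; transcript_id "TX_A1"; exon_number "1";'
)
record = parse_gtf_line(toy_line)
print(record["feature"], record["start"], record["end"])
print(record["attributes"]["gene_id"])
```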
As mapping is a time-consuming step, the mapping results are provided in the imported history “Imported: Neuro-epigenetics training (data)”.
Tool: deepTools bamCoverage
Documentation: https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html
Citation: Ramírez, Fidel, Devon P. Ryan, Björn Grüning, Vivek Bhardwaj, Fabian Kilpert, Andreas S. Richter, Steffen Heyne, Friederike Dündar, and Thomas Manke. deepTools2: A next Generation Web Server for Deep-Sequencing Data Analysis. Nucleic Acids Research (2016). doi:10.1093/nar/gkw257.
Use of the tool: deepTools is a suite of tools meant to handle next-generation sequencing data, especially ChIP-seq and RNA-seq data. Some of its tools create plots that give a global view of the data.
Do it for the two files WT_320_St.chr19.bam and R6_1_387_St.chr19.bam.
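bamCoverage turns a BAM file into a coverage track by counting reads in consecutive bins and, optionally, normalizing the counts. The toy sketch below illustrates that idea with counts per million (CPM) as the normalization. It is a simplified illustration, not the deepTools implementation: the function names, the read coordinates (0-based, half-open), and the bin size are hypothetical.

```python
def binned_coverage(reads, region_length, bin_size):
    """Count the reads overlapping each fixed-size bin of a region."""
    n_bins = -(-region_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for start, end in reads:
        first_bin = start // bin_size
        last_bin = min((end - 1) // bin_size, n_bins - 1)
        for b in range(first_bin, last_bin + 1):
            counts[b] += 1
    return counts


def to_cpm(counts, total_mapped_reads):
    """Scale raw bin counts to counts per million mapped reads."""
    scale = 1_000_000 / total_mapped_reads
    return [c * scale for c in counts]


# Hypothetical read intervals on a 200 bp region.
toy_reads = [(0, 50), (40, 90), (45, 95), (120, 170)]
counts = binned_coverage(toy_reads, region_length=200, bin_size=50)
cpm = to_cpm(counts, total_mapped_reads=len(toy_reads))
print(counts)  # [3, 2, 1, 1]
print(cpm)
```

Normalizing by the total number of mapped reads is what makes tracks from samples of different sequencing depth, such as the two samples used here, comparable in a genome browser.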
Do it for the two result datasets.
Note: each .bam.bai index file must be in the same directory as its BAM file; otherwise the BAM file won’t be loaded!
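Before opening IGV, it can be useful to check that every BAM file has its index next to it. The sketch below is one possible check, with a hypothetical helper name `bams_missing_index`. It accepts either the `sample.bam.bai` or the `sample.bai` naming, and it runs on empty placeholder files in a temporary directory rather than on the real downloads.

```python
import tempfile
from pathlib import Path


def bams_missing_index(directory):
    """Return the names of BAM files in `directory` without an index."""
    missing = []
    for bam in sorted(Path(directory).glob("*.bam")):
        candidates = [
            bam.with_name(bam.name + ".bai"),  # sample.bam.bai
            bam.with_suffix(".bai"),           # sample.bai
        ]
        if not any(candidate.exists() for candidate in candidates):
            missing.append(bam.name)
    return missing


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in [
        "WT_320_St.chr19.bam",
        "WT_320_St.chr19.bam.bai",
        "R6_1_387_St.chr19.bam",  # deliberately left without an index
    ]:
        (Path(tmp) / name).touch()
    missing = bams_missing_index(tmp)

print(missing)
```

On your own machine, call `bams_missing_index` with the directory where the BAM files were downloaded; an empty list means every BAM file has an index beside it.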
In the IGV menu:
Select the BAM files and the bigWig files.
You should get:
It has been downloaded from GEO. It is available in the file data/GSE59571_S13113_readCounts.xlsx. We are going to run a differential expression analysis on these data.
Tool: BioJupies
Website: https://maayanlab.cloud/biojupies/
Documentation: https://maayanlab.cloud/biojupies/help
Citation: Torre D, Lachmann A, Ma’ayan A. BioJupies: Automated Generation of Interactive Notebooks for RNA-Seq Data Analysis in the Cloud. Cell Syst. 2018 Nov 28;7(5):556-561.e3. doi: 10.1016/j.cels.2018.10.007. Epub 2018 Nov 14. PMID: 30447998; PMCID: PMC6265050.
Use of the tool: BioJupies is a web application for RNA-seq data analysis. Through an intuitive interface, users can rapidly generate tailored reports to analyze and visualize their own raw sequencing files or gene expression tables, or fetch data from >9,000 published studies containing >300,000 preprocessed RNA-seq samples.
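To give an intuition for what a differential expression analysis compares, the toy sketch below normalizes raw read counts to counts per million (CPM) and computes a log2 fold change per gene between two samples. This is only an illustration of the underlying quantities, not the method used by BioJupies: there is no statistical test, and the gene names, counts, pseudocount, and function names are hypothetical.

```python
import math


def counts_per_million(counts):
    """Normalize raw read counts by library size."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(cpm_control, cpm_treated, pseudocount=1.0):
    """Compute log2(treated / control) per gene, with a pseudocount."""
    return {
        gene: math.log2(
            (cpm_treated[gene] + pseudocount)
            / (cpm_control[gene] + pseudocount)
        )
        for gene in cpm_control
    }


# Hypothetical read counts for three genes in two samples.
control = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
treated = {"GENE_A": 400, "GENE_B": 300, "GENE_C": 300}

lfc = log2_fold_change(
    counts_per_million(control), counts_per_million(treated)
)
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

A positive value means the gene has a higher normalized count in the second sample, a negative value a lower one. A real analysis also needs replicates and a statistical model to decide which differences are significant.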
The created notebook is available here: https://maayanlab.cloud/biojupies/notebook/3bDxb3Opy or click to run the analysis report.